ENCODE
Introduction: What is ENCODE?
ENCODE is the Encyclopedia of DNA Elements. The project's goal is to identify all functional elements in the human genome. The completion of the Human Genome Project in 2003 was a landmark success for the field of biology. While this project provided vast amounts of raw sequence data, it told researchers little about the functions and roles played by various regions of the genome. Furthermore, the exome, the protein-encoding portion of the genome (approximately 20,000 genes in humans), was found to make up less than 2% of our DNA. Could most of our genetic code truly be "junk" DNA? Are there any sequences in this non-coding region that actually play a role in influencing the biology of an organism? Questions like these sparked interest in a large-scale endeavor to explore the functionality of the human genome beyond the protein-encoding exome, and thus ENCODE was born.

History of ENCODE

ENCODE Phase 1
The ENCODE project represented a vast undertaking that would require careful implementation of the most advanced technologies available to scientists. For this reason, the initial phase of the project involved an in-depth evaluation of various methodologies to determine the most efficient way to use them during the later stages of the project. Specifically, Phase 1 aimed to develop new high-throughput methods for identifying functional elements, as well as chromatin immunoprecipitation and quantitative PCR techniques.

ENCODE Phase 2
needed

Future of ENCODE
According to a 2012 News Feature in Nature, researchers agree that they are far from finished. Having mapped half of the data and characterized 10% of it, the ENCODE consortium foresees a third phase of the project, described as the "build-out" phase. The third phase will attempt to provide even more specific information, detailing a "human instruction manual". To accomplish this, more experiments will be done on a variety of other cell types.
Researchers have shown interest in using cells taken directly from a person in future experiments, while considering the challenges this may present. There has also been interest in assessing how individual genetic variation affects the activity of regulatory elements in the genome. Lastly, exploring how regions of the genome interact in three-dimensional space has been an area of interest. (http://www.nature.com/news/encode-the-human-encyclopaedia-1.11312)

What data are included within ENCODE?
*Transcription factor binding sites
*Regulatory DNA switches
*Chromatin patterns (histone binding, etc.)
*Regions affecting DNA folding in chromatin
*Regions of conservation with the mouse genome (the initial comparison)
*Regions subject to active selection
*ENCODE is also being expanded to other model organisms

How are the ENCODE data generated, validated and accessed?
ENCODE has defined standards for collecting and processing each data type. These data standards can be found at the public ENCODE portal: http://genome.ucsc.edu/ENCODE/dataStandards.html

The validation of ENCODE data is handled by a quality assurance team at the Data Coordination Center. All data are reviewed before they are released to the public. To verify an experiment, it is necessary to obtain two highly concordant biological replicates generated with the same experimental technique. ENCODE also provides validation at multiple levels, which can be supported by cross-correlation between disparate data types (e.g., parallel analysis of the same biological samples with alternate detection technologies). Reference: http://www.plosbiology.org/article/info%3Adoi%2F10.1371%2Fjournal.pbio.1001046

What is the scientific utility of the data contained within ENCODE?

Summary of results from ENCODE
On 5 September 2012, 30 papers on the ENCODE project were released, encompassing the work of 442 named scientists. A summary of all of the work was posted by the journal Nature.
The papers assigned some biological or biochemical function to 80% of the genome. Other key results:
*Redefined what a gene is
*Described a "wiring pattern" between regulatory proteins and their activity
*Showed that intergenic regions are more important than was previously thought

Controversy regarding ENCODE
The biggest controversy arises from the lack of a common definition of “functional”. Stating that 80% of the genome is functional does not imply that 80% is actively doing something in terms of gene expression. "Function" in this sense also covers, for example, binding sites; it does not mean that a given sequence is necessary for the survival of the organism. There has also been criticism of how the ENCODE project was marketed. A blog post by Michael Eisen, an evolutionary biologist at UC Berkeley, "discussing" the ENCODE media coverage is listed below under Opinion about ENCODE from scientists.

Articles about ENCODE in scientific journals
#ENCODE Project Consortium (2004) The ENCODE (ENCyclopedia Of DNA Elements) Project. Science 306, 636-640. PMID 15499007. Beginnings of ENCODE.
#ENCODE Project Consortium (2007) Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature 447, 799-816. PMID 17571346. Summary of results of the ENCODE Pilot Project.
#ENCODE Project Consortium (2011) A user's guide to the encyclopedia of DNA elements (ENCODE). PLoS Biol 9, e1001046. PMID 21526222. Nicely organized summary of the ENCODE project from inception to 2011, written somewhat in the form of a user guide.
#Maher, B. (2012) ENCODE: The human encyclopedia. Nature 489, in press.

News about ENCODE from lay media
#Kolata, G (2012) Bits of Mystery DNA, Far From 'Junk', Play Crucial Role. New York Times online (posted 2012/09/05). Another excellent starting point, from a well-respected science writer. The reader comments are also very worthwhile.
#Saey, TH (2012) Team releases sequel to the human genome. Science News online (posted 2012/09/05).
#Walsh, F (2012) Detailed map of genome function. BBC News online (posted 2012/09/05).
#Yong, E (2012) ENCODE: the rough guide to the human genome. Not Exactly Rocket Science blog (posted 2012/09/05). An excellent and very well-written introduction to the new ENCODE results. Great place to get started.

Opinion about ENCODE from scientists
#Birney, E (2012) ENCODE: My own thoughts. Ewan's Blog; Bioinformatician at Large (posted 2012/09/05).
#Eisen, M (2012) This 100,000 word post on the ENCODE media bonanza will cure cancer. Michael Eisen's blog (posted 2012/09/06).

Additional resources
*National Human Genome Research Institute (2012) The ENCODE Project: ENCyclopedia of DNA Elements. NIH home page for the ENCODE project. Good description of project goals, history and funding.
*ENCODE Project Consortium & Nature (journal). Web Portal to ENCODE 2012 Results.
*UC Santa Cruz Genome Browser Project (2012) ENCODE Data via UCSC Genome Browser.
*University of Washington. Encyclopedia of DNA elements compiled; UW a key force in Project ENCODE. UW article on the ENCODE Project, with nice descriptions of several projects.
*Science (online) (2012) Transcript of ENCODE discussion with Ewan Birney and John Stamatoyannopoulos.